Exercise

Use library(ggplot2) to load the package.

library(ggplot2)

Specify the dataset.

What does ggplot(diamonds) do?

ggplot(diamonds)

Answer: It shows an empty plot, because ggplot has not yet been given any instructions about which variables to map or how to draw them.

Add the aesthetics.

What does ggplot(diamonds, aes(x = carat, y = price)) do?

ggplot(diamonds, aes(x = carat, y = price))

Answer: It draws the axes and grid for carat (x) and price (y), because we have said what goes on each axis, but no data are drawn because we have not specified a geometry (geom).
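The two answers above can also be checked by inspecting the plot objects instead of the rendered figures. This is a minimal sketch: the object names p_data, p_aes and p_points are illustrative, and it assumes the plot object exposes its layers and mapping components through `$`, as ggplot2 plot objects generally do.

```r
library(ggplot2)

# Data only: no aesthetics and no layers, so nothing can be drawn.
p_data <- ggplot(diamonds)

# Data plus aesthetics: the axes can be scaled, but there are still no layers.
p_aes <- ggplot(diamonds, aes(x = carat, y = price))

# Adding a geom creates the first layer, which is what draws the points.
p_points <- p_aes + geom_point(size = 0.7)

length(p_data$layers)    # number of layers in the data-only plot
names(p_aes$mapping)     # aesthetics mapped in the main ggplot() call
length(p_points$layers)  # number of layers once a geom has been added
```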

Add geometric objects

  • Add data points showing carat on the x-axis and price on the y-axis.
ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7)

  • Color data points by cut. (Copy-paste and extend the code chunk above.)
ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7, aes(colour = cut))

  • Add a smoothed mean trend line. (Copy-paste and extend the code chunk above.)
ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7, aes(colour = cut)) +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

  • Assign that last plot to an object called obds_diamonds.
obds_diamonds <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7, aes(colour = cut)) +
  geom_smooth()

Exercise

Predict the difference between these two plots

Plot 1

ggplot(diamonds, aes(x = carat, y = price, colour = cut)) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

Plot 2

ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(colour = cut)) +
  geom_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

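Answer: In Plot 1, colour = cut is set in the main ggplot() call, so every layer inherits it: the points are coloured by cut and geom_smooth() automatically fits a separate smooth trend line for each cut. In Plot 2, colour = cut applies only to geom_point(), so geom_smooth() fits a single trend line across all the data. There is no right answer; it depends on what you are looking for!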
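One way to confirm this without comparing the figures by eye is to count how many trend lines each plot computes. The sketch below uses layer_data() from ggplot2 to pull out the data computed for the second layer. The names plot_1, plot_2, n_lines_1 and n_lines_2 are illustrative, and method = "lm" is swapped in here only to keep the check fast; it is not the smoother used in the exercise.

```r
library(ggplot2)

# Colour mapped in the main ggplot() call: inherited by every layer.
plot_1 <- ggplot(diamonds, aes(x = carat, y = price, colour = cut)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Colour mapped only inside geom_point(): geom_smooth() does not see it.
plot_2 <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(aes(colour = cut)) +
  geom_smooth(method = "lm", formula = y ~ x)

# layer_data(plot, 2) returns the data computed for the second layer,
# i.e. the fitted trend line(s); each group corresponds to one fitted line.
n_lines_1 <- length(unique(layer_data(plot_1, 2)$group))
n_lines_2 <- length(unique(layer_data(plot_2, 2)$group))

n_lines_1  # one line per level of cut
n_lines_2  # a single line for the whole data set
```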

Exercise

Trend lines

Using the ChickWeight data set:

head(ChickWeight)
## Grouped Data: weight ~ Time | Chick
##   weight Time Chick Diet
## 1     42    0     1    1
## 2     51    2     1    1
## 3     59    4     1    1
## 4     64    6     1    1
## 5     76    8     1    1
## 6     93   10     1    1
  • Create a scatter plot of weight (y-axis) over time (x-axis).
ggplot(ChickWeight, aes(x = Time, y = weight)) +
  geom_point()

  • Color by diet. (Copy-paste and extend the code chunk above.)
ggplot(ChickWeight, aes(x = Time, y = weight)) +
  geom_point(aes(colour = Diet))

  • Add a linear mean trend line for each diet. (Copy-paste and extend the code chunk above.)
ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_point() +
  geom_smooth(method = lm)
## `geom_smooth()` using formula 'y ~ x'

At this point you should be able to visually identify the diet that leads to the highest mean increase in weight.

Answer: Diet 3
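The visual answer can be cross-checked numerically by fitting the same kind of straight line that geom_smooth(method = lm) draws, once per diet, and comparing the slopes. This is a rough sketch rather than a formal analysis: it ignores the repeated measurements on each chick, and the name slopes is illustrative.

```r
# ChickWeight ships with R (datasets package), so no extra packages are needed.
chick <- as.data.frame(ChickWeight)

# Fit weight ~ Time separately for each diet and keep the slope,
# i.e. the mean weight gain per unit of Time.
slopes <- vapply(
  split(chick, chick$Diet),
  function(d) unname(coef(lm(weight ~ Time, data = d))["Time"]),
  numeric(1)
)

slopes                    # one slope per diet
names(which.max(slopes))  # diet with the steepest fitted line
```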

  • Facet a ribbon of sub-plots, one per diet. (Copy-paste and extend the code chunk above.)
ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_point() +
  geom_smooth(method = lm) +
  facet_wrap(~Diet, nrow = 2)
## `geom_smooth()` using formula 'y ~ x'

  • Assign that last plot to an object called obds_chickweight.
obds_chickweight <- ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_point() +
  geom_smooth(method = lm) +
  facet_wrap(~Diet, nrow = 2)

Exercise

Bar plot

  • Load the ggplot2::msleep data set.
msleep
## # A tibble: 83 × 11
##    name         genus vore  order conse…¹ sleep…² sleep…³ sleep…⁴ awake  brainwt
##    <chr>        <chr> <chr> <chr> <chr>     <dbl>   <dbl>   <dbl> <dbl>    <dbl>
##  1 Cheetah      Acin… carni Carn… lc         12.1    NA    NA      11.9 NA      
##  2 Owl monkey   Aotus omni  Prim… <NA>       17       1.8  NA       7    0.0155 
##  3 Mountain be… Aplo… herbi Rode… nt         14.4     2.4  NA       9.6 NA      
##  4 Greater sho… Blar… omni  Sori… lc         14.9     2.3   0.133   9.1  0.00029
##  5 Cow          Bos   herbi Arti… domest…     4       0.7   0.667  20    0.423  
##  6 Three-toed … Brad… herbi Pilo… <NA>       14.4     2.2   0.767   9.6 NA      
##  7 Northern fu… Call… carni Carn… vu          8.7     1.4   0.383  15.3 NA      
##  8 Vesper mouse Calo… <NA>  Rode… <NA>        7      NA    NA      17   NA      
##  9 Dog          Canis carni Carn… domest…    10.1     2.9   0.333  13.9  0.07   
## 10 Roe deer     Capr… herbi Arti… lc          3      NA    NA      21    0.0982 
## # … with 73 more rows, 1 more variable: bodywt <dbl>, and abbreviated variable
## #   names ¹​conservation, ²​sleep_total, ³​sleep_rem, ⁴​sleep_cycle
  • Draw a bar plot of number of observations (i.e., rows) for each taxonomic order (i.e., one plot and one bar per taxonomic order).
ggplot(msleep, aes(x = order)) +
  geom_bar()
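
geom_bar() counts the rows in each category for you. To make that explicit, the sketch below computes the counts by hand and draws them with geom_col(), which should give the same bars. The names order_counts and n are illustrative.

```r
library(ggplot2)

# Count the rows per taxonomic order by hand.
order_counts <- as.data.frame(table(order = msleep$order), responseName = "n")

# geom_col() draws the precomputed counts instead of counting rows itself.
ggplot(order_counts, aes(x = order, y = n)) +
  geom_col()
```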

  • Change the angle and font size of the text for the x-axis ticks (not the axis titles). Justify the text of those x-axis ticks as right-aligned. (Copy-paste and extend the code chunk above.)
ggplot(msleep, aes(x = order)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5))

  • Change the value and font size of the title for both x and y axes. (Copy-paste and extend the code chunk above.)
ggplot(msleep, aes(x = order)) +
  geom_bar() +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5))) +
  labs(x = "Order", y = "Observations")

  • Fill each bar with colors, proportionally to the count of each genus. (Copy-paste and extend the code chunk above.)
ggplot(msleep, aes(x = order)) +
  geom_bar(aes(fill = genus)) +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5))) +
  labs(x = "Order", y = "Observations")

From this point onwards, you may need to iteratively resize the text of the ticks and axes for readability.

  • Reduce the legend key size. (Recommendation: use unit(2, "mm")). (Copy-paste and extend the code chunk above.)
ggplot(msleep, aes(x = order)) +
  geom_bar(aes(fill = genus)) +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5)),
    legend.key.size = unit(2, "mm")) +
  labs(x = "Order", y = "Observations")

  • Force the legend to be displayed in 3 columns. (Recommendation: use guide_legend(...)). (Copy-paste and extend the code chunk above.)
ggplot(msleep, aes(x = order)) +
  geom_bar(aes(fill = genus)) +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5)),
    legend.key.size = unit(2, "mm"),
    legend.text = element_text(size = rel(0.5))
    ) +
  labs(x = "Order", y = "Observations") +
  guides(fill = guide_legend(ncol = 3))

  • Add a contour of thin black lines to the bars.
ggplot(msleep, aes(x = order)) +
  geom_bar(aes(fill = genus), colour = "black", size = 0.1) +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5)),
    legend.key.size = unit(2, "mm"),
    legend.text = element_text(size = rel(0.5))
    ) +
  labs(x = "Order", y = "Observations") +
  guides(fill = guide_legend(ncol = 3))

  • Assign that last plot to an object called obds_msleep.
obds_msleep <- ggplot(msleep, aes(x = order)) +
  geom_bar(aes(fill = genus), colour = "black", size = 0.1) +
  theme(
    axis.text.x = element_text(angle = 90, size = rel(0.9), hjust = 1, vjust = 0.5), 
    axis.title = element_text(size = rel(1.5)),
    legend.key.size = unit(2, "mm"),
    legend.text = element_text(size = rel(0.5))
    ) +
  labs(x = "Order", y = "Observations") +
  guides(fill = guide_legend(ncol = 3))

Exercise

Plotting grid

  • Collate the plots that we assigned to objects throughout the day into a single plot.

    • Plots: obds_diamonds, obds_chickweight, obds_msleep.

    • Methods: cowplot::plot_grid(), patchwork, gridExtra::grid.arrange().

Using cowplot.

library(cowplot)
first_row <- cowplot::plot_grid(
  obds_chickweight, obds_diamonds,
  labels = c("A", "B"), ncol = 2, nrow = 1
)
## `geom_smooth()` using formula 'y ~ x'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
second_row <- cowplot::plot_grid(obds_msleep, labels = "C", ncol = 1, nrow = 1)
# Stack the two rows; each row already carries its own panel labels.
super_plot <- cowplot::plot_grid(first_row, second_row, nrow = 2)
super_plot

Using patchwork.

library(patchwork)
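
A possible patchwork version of the same layout is sketched below. The three plots are redefined as simplified stand-ins only so that the block runs on its own, and the name super_plot_patchwork is illustrative.

```r
library(ggplot2)
library(patchwork)

# Simplified stand-ins so the block runs on its own; skip these three
# assignments if the objects from the earlier exercises already exist.
obds_diamonds <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7, aes(colour = cut))
obds_chickweight <- ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_point() +
  facet_wrap(~Diet, nrow = 2)
obds_msleep <- ggplot(msleep, aes(x = order)) +
  geom_bar()

# `|` places plots side by side and `/` stacks rows;
# plot_annotation(tag_levels = "A") tags the panels A, B, C.
super_plot_patchwork <- (obds_chickweight | obds_diamonds) / obds_msleep +
  plot_annotation(tag_levels = "A")
super_plot_patchwork
```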

Using gridExtra.

library(gridExtra)
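
A possible gridExtra version is sketched below, again with simplified stand-in plots so that the block runs on its own. It uses arrangeGrob() to build the layout and grid::grid.draw() to display it; grid.arrange() does both in one step. The name super_plot_grid is illustrative, and unlike the cowplot version this sketch does not add panel labels.

```r
library(ggplot2)
library(gridExtra)

# Simplified stand-ins so the block runs on its own; skip these three
# assignments if the objects from the earlier exercises already exist.
obds_diamonds <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(size = 0.7, aes(colour = cut))
obds_chickweight <- ggplot(ChickWeight, aes(x = Time, y = weight, colour = Diet)) +
  geom_point() +
  facet_wrap(~Diet, nrow = 2)
obds_msleep <- ggplot(msleep, aes(x = order)) +
  geom_bar()

# layout_matrix: row 1 holds plots 1 and 2, row 2 is filled entirely by plot 3.
super_plot_grid <- arrangeGrob(
  obds_chickweight, obds_diamonds, obds_msleep,
  layout_matrix = rbind(c(1, 2), c(3, 3))
)

# Draw the assembled layout on a fresh page.
grid::grid.newpage()
grid::grid.draw(super_plot_grid)
```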
  • Export the new plot in a PDF file, and open it in a PDF viewer (e.g. Adobe Acrobat Reader DC).

You will likely need a few attempts to fine-tune the width and height of the output file.

ggsave(
  "superplot.pdf",
  super_plot
)
## Saving 7 x 5 in image
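
When fine-tuning, it may help to set the size explicitly instead of relying on the default reported above. The sketch below writes a stand-in plot to a temporary file; the dimensions are hypothetical starting values, and demo_plot and out_file are illustrative names. In practice you would pass super_plot and your own file name.

```r
library(ggplot2)

# Stand-in plot so the block runs on its own; use super_plot in practice.
demo_plot <- ggplot(ChickWeight, aes(x = Time, y = weight)) +
  geom_point()

# Hypothetical dimensions; adjust until text and legends are readable.
out_file <- file.path(tempdir(), "superplot_demo.pdf")
ggsave(out_file, demo_plot, width = 12, height = 10, units = "in")
```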

Exercise

Pair programming

  • Explore the data set ggplot2::mpg and generate the most informative plot that you can!